Course title |
Dimension Reduction for Data Science |
Semester |
108-1 |
Offered to |
College of Science, Graduate Institute of Mathematics |
Instructor |
李克昭 |
Course number |
MATH5189 |
Course identifier |
221 U8550 |
Section |
|
Credits |
2.0 |
Full/half year |
Half year |
Required/elective |
Elective |
Class time |
Monday, periods 1-2 (8:10-10:00) |
Classroom |
Astronomy-Mathematics Building, Room 102 |
Remarks |
Enrollment cap: 30 students |
Ceiba course website |
http://ceiba.ntu.edu.tw/1081MATH5189_ |
Course introduction video |
|
Core competency mapping |
No core competency mapping has been set up for this course |
Course syllabus
|
To protect everyone's rights, please respect intellectual property rights and do not make illegal photocopies.
|
Course description |
Dimension reduction for data science
Data science as a discipline requires transparency to thrive. Problem-solving in data science often employs multiple methods packaged into one or more bundles by algorithm inventors or software developers. Because of the complexity of the data in terms of volume, dimension, structure, and mode, new methods tend to contain existing methods as inner layers. This creates a danger of gradual loss of transparency for both users and future developers.
Dimension reduction (DR) is a key component of many data-analytic algorithms and software packages used in data science. Principal component analysis (PCA), implemented as an inner layer in numerous packages, has become a household name that is often treated as synonymous with dimension reduction across a variety of scientific disciplines. In addition to PCA, many other methods for dimension reduction have been developed. The goal of this course is to present a general statistical framework for dimension reduction and to discuss the shared and unique properties of different DR methods.
This is a graduate course for students who have adequate undergraduate-level training in statistics, applied mathematics, data science, data engineering, or a related field. Exceptionally strong undergraduate students are also welcome to take this course.
The class will meet once a week for one and a half hours.
|
Course objectives |
To be added |
Course requirements |
This is a graduate course for students who have adequate undergraduate-level training in statistics, applied mathematics, data science, data engineering, or a related field. Exceptionally strong undergraduate students are also welcome to take this course.
The class will meet once a week for one and a half hours.
Grading is based on class participation (20%) and a term project (80%).
The written report should have about 10 pages of main content; if you have more to report, put it in an appendix as supplementary information.
The last lecture will be given on December 16, 2019.
The final report should be submitted before January 5, 2020.
There are two options:
Track 1: selected homework problems, review of reference papers, etc.
Track 2: data analysis project, new ideas to contribute, new algorithms, etc.
For Track 2, teams of up to 3 participants are allowed.
|
Expected weekly study hours outside class |
|
Office Hours |
|
Required reading |
To be added |
References |
To be added |
Grading (for reference only) |
|
Week |
Date |
Topic |
Week 1 |
9/09 |
The topic of dimension reduction in data science |
Week 2 |
9/16 |
The key to dimension reduction: symmetry |
Week 3 |
9/23 |
A vibrant world of regression models devoid of the original flavor |
Week 4 |
9/30 |
No class this week |
Week 5 |
10/07 |
Bias-variance tradeoff: the intertwined relationship among model selection criteria, Tikhonov regularization, LASSO, and cross-validation, compounded by the issue of honesty |
Week 7 |
10/21 |
A wide world of classification tasks: the issue of support separability |
Week 8 |
10/28 |
Classification Part II: SVM |
Week 9 |
11/04 |
Classification Part III: additional notes |
Week 10 |
11/11 |
Where MA (multivariate analysis) meets RL (representation learning): DR (dimension reduction) |
Week 11 |
11/18 |
A comparison of several matrix decompositions for dimension reduction
|
Week 12 |
11/25 |
Sliced inverse regression for dimension reduction: how much information is preserved? |
Week 13 |
12/02 |
Principal Hessian Directions (PHD): curvature pursuit |
Week 14 |
12/09 |
Liquid association |
Week 15 |
12/16 |
Data visualization, deep learning, and transfer learning |
Week 16 |
12/23 |
No lecture |
Week 17 |
12/30 |
No lecture |
|